
[AutoParallel] pp dataloader align mode#73941

Merged
xuxinyi389 merged 9 commits into PaddlePaddle:develop from zty-king:add_align_mode_for_data_process
Jul 31, 2025

@zty-king
Contributor

@zty-king zty-king commented Jul 9, 2025

PR Category

Auto Parallel

PR Types

Others

Description

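Adds dataloader logic alignment under dp for the case where dynamic semi-auto parallel is aligned with dynamic manual parallel. The current logic of the two modes is shown in the figure below: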
image

@paddle-bot

paddle-bot bot commented Jul 9, 2025

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the automated CI test results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Jul 9, 2025
@codecov-commenter

codecov-commenter commented Jul 15, 2025

Codecov Report

❌ Patch coverage is 21.42857% with 11 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@0fbc27b). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...distributed/auto_parallel/pipelining/microbatch.py | 21.42% | 11 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (21.42%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #73941   +/-   ##
==========================================
  Coverage           ?   21.42%           
==========================================
  Files              ?        1           
  Lines              ?       14           
  Branches           ?        0           
==========================================
  Hits               ?        3           
  Misses             ?       11           
  Partials           ?        0           


@zty-king
Contributor Author

[image] The CI unit tests have some issues, so here is a screenshot of the coverage from a local test run.

@paddle-ci-bot

paddle-ci-bot bot commented Jul 24, 2025

Sorry to inform you that b2b9f60's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

dp_pp_align_mode_losses,
dp_pp_losses,
rtol=1e-5,
)
Contributor

When the feeding order is exactly the same, shouldn't the md5 of the first step's forward loss be identical in the two cases?

Contributor Author

Done

image image image
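
The exchange above contrasts two ways of comparing losses: a tolerance-based check (the `rtol=1e-5` in the reviewed test) and a bitwise check through md5. A minimal sketch of the difference, using only the Python standard library. The helpers `loss_md5` and `next_float32_up`, the float32 storage assumption, and the sample loss value are all hypothetical and not taken from the PR; `math.isclose` only approximates the semantics of a NumPy-style `rtol` check.

```python
import hashlib
import math
import struct


def loss_md5(loss: float) -> str:
    """md5 hex digest of a loss stored as little-endian float32 (hypothetical helper)."""
    return hashlib.md5(struct.pack("<f", loss)).hexdigest()


def next_float32_up(value: float) -> float:
    """Return the next representable float32 above a positive finite value."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (nxt,) = struct.unpack("<f", struct.pack("<I", bits + 1))
    return nxt


# Made-up first-step loss standing in for both runs; not a result from the PR.
dp_pp_loss = 2.3025851
dp_pp_align_mode_loss = 2.3025851

# Tolerance-based check: passes when the values agree to within rel_tol.
assert math.isclose(dp_pp_align_mode_loss, dp_pp_loss, rel_tol=1e-5)

# Bitwise check: passes only if every bit of the float32 value matches.
assert loss_md5(dp_pp_align_mode_loss) == loss_md5(dp_pp_loss)

# A one-ULP float32 perturbation still passes the tolerance check
# but produces a different md5 digest.
perturbed = next_float32_up(dp_pp_loss)
assert math.isclose(perturbed, dp_pp_loss, rel_tol=1e-5)
assert loss_md5(perturbed) != loss_md5(dp_pp_loss)
```

The md5 comparison is the stricter of the two: it is the kind of check that makes sense when both runs are expected to see the same data in the same order, as the reviewer's question assumes.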

@xuxinyi389 xuxinyi389 changed the title from "dataloader与动手逻辑对齐" (align dataloader logic with dynamic manual parallel) to "[AutoParallel] pp dataloader align mode" Jul 30, 2025
@zty-king
Contributor Author

/re-run all-failed

Contributor

@xuxinyi389 xuxinyi389 left a comment

LGTM

Contributor

@From00 From00 left a comment

LGTM

@xuxinyi389
Contributor

/re-run all-failed

@xuxinyi389 xuxinyi389 merged commit 4d4fdc4 into PaddlePaddle:develop Jul 31, 2025
91 of 98 checks passed
@zty-king zty-king deleted the add_align_mode_for_data_process branch November 23, 2025 14:17
5 participants